Efficient Support for All Levels of Parallelism for Complex Media Applications
Abstract
Real-time execution of contemporary media applications (e.g., high-resolution video encoding/conferencing/editing, face/image/speech recognition, and image synthesis like ray-tracing) needs a considerable amount of processing power that surpasses the capabilities of current superscalar processors. Further, high-performance processors are often constrained by power/energy consumption, especially in mobile systems, where media applications have become increasingly popular. Fortunately, most media applications have a lot of data-level parallelism (DLP), which can potentially be exploited for energy-efficient high performance. Unfortunately, DLP in these complex applications is often interspersed with control. Therefore, architectures that support only DLP efficiently are unlikely to suffice. This work makes two broad contributions. First, it makes the case that the complexity of contemporary media applications requires energy-efficient support for multiple forms of parallelism, including ILP, TLP, and various forms of DLP such as sub-word SIMD instructions, vectors, and streams. Second, it shows that all of these forms of parallelism can be efficiently integrated within an evolutionary architecture with little additional hardware support compared to conventional paradigms. We propose such an architecture, called ALP, which is a superscalar-based CMP/SMT architecture augmented with novel DLP support. Our evaluations show that our design decisions in ALP are effective. For our application suite, relative to a single-thread superscalar, ALP achieves speedups from 5X to 49X, energy reduction of up to 7.4X, and energy-delay product (EDP) reduction of 5X to 361X. We also find that no single type of parallelism gives the best possible speedup or energy efficiency for these applications. Support for all levels of parallelism is essential in obtaining the possible speedups and energy savings.
Our results also support the claim that a superscalar-based CMP/SMT architecture augmented with evolutionary DLP support is promising for obtaining high performance and energy efficiency in complex media applications. As part of this work, we have also performed a study to find the most energy-efficient general-purpose architecture for supporting multiple threads. We perform a design-space search comparing the energy efficiency of various CMP, SMT, and hybrid architectures and identify key factors that contribute to an energy-efficient implementation of thread support. In the future, we will improve ALP by enhancing the vector support and memory system, study the importance of ILP in the presence of DLP, and evaluate more applications to strengthen our conclusions.
Similar resources
Energy Efficient Support for All Levels of Parallelism for Complex Media Applications
Real-time complex media applications are becoming increasingly common on general-purpose systems such as desktop, laptop, and handheld computers. However, real-time execution of such complex media applications needs a considerable amount of processing power that often surpasses the capabilities of current superscalar processors. Further, high performance processors are often constrained by powe...
An Efficient Algorithm for General 3D-Seismic Body Waves (SSP and VSP Applications)
The ray series method may be generalized using a ray centered coordinate system for general 3D-heterogeneous media. This method is useful for Amplitude Versus Offset (AVO) seismic modeling, seismic analysis, interpretational purposes, and comparison with seismic field observations. For each central ray (constant ray parameter), the kinematic (the eikonal) and dynamic ray tracing system ...
Automatic Discovery of Coarse-Grained Parallelism in Media Applications
With the increasing use of multi-core microprocessors and hardware accelerators in embedded media processing systems, there is an increasing need to discover coarse-grained parallelism in media applications written in C and C++. Common versions of these codes use a pointer-heavy, sequential programming model to implement algorithms with high levels of inherent parallelism. The lack of automated...
Data-Level and Thread-Level Parallelism in Emerging
Multimedia applications are becoming increasingly important for a large class of general-purpose processors. Contemporary media applications are highly complex and demand high performance. A distinctive feature of these applications is that they have significant parallelism, including thread-, data-, and instruction-level parallelism, that is potentially well-aligned with the increasing paralle...
A Compound Decision Support System for Corporate Planning
Planning at the macro level for any corporation or firm, in the form of organizational or enterprise resource planning, is particularly important nowadays. To meet enterprise resource planning needs, application software packages provide a uniform, pre-prepared, and pre-designed set that covers all business processes throughout an organization. To achieve maximum efficiency in the implementation of th...